Strategies for Efficiently Keeping Local Linked Open Data Caches Up-To-Date
Authors
Abstract
Linked Open Data (LOD) applications often pre-fetch data from the Web and store local copies in a cache for faster access at runtime. Yet recent investigations have shown that data published and interlinked on the LOD cloud is subject to frequent changes. As the data in the cloud changes, local copies of the data need to be updated. However, because computational resources are limited (e.g., network bandwidth for fetching data, computation time), LOD applications may not be able to continuously visit all LOD sources at short intervals to check for changes. These limitations make it necessary to prioritize which data sources should be fetched first when synchronizing the local copy with the original data. To make the best use of the available resources, it is vital to choose a good scheduling strategy that determines when to fetch data from which source. In this paper, we investigate different strategies proposed in the literature and evaluate them on a large-scale LOD dataset obtained from the LOD cloud by weekly crawls over the course of three years. We investigate two setups: (i) in the single-step setup, we evaluate the quality of update strategies for a single, isolated update of a local data cache, while (ii) in the iterative progression setup, we measure the quality of the local data cache under iterative updates over a longer period of time. Our evaluation indicates the effectiveness of each strategy for updating local copies of LOD sources, i.e., for given bandwidth limitations, we report each strategy's performance in terms of data accuracy and freshness. The evaluation shows that measures capturing the change behavior of LOD sources over time are the most suitable for conducting updates.
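The prioritization problem described in the abstract can be illustrated with a minimal sketch. This is not one of the strategies evaluated in the paper: the `Source` fields, the `score` values (standing in for some measure of change behavior), the cost units, and the greedy selection rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    uri: str      # hypothetical source identifier
    size: int     # assumed fetch cost in arbitrary bandwidth units
    score: float  # assumed priority, e.g. an observed change rate


def schedule_updates(sources, budget):
    """Greedily pick the highest-scoring sources that still fit the budget."""
    selected = []
    remaining = budget
    for src in sorted(sources, key=lambda s: s.score, reverse=True):
        if src.size <= remaining:
            selected.append(src.uri)
            remaining -= src.size
    return selected


# Made-up example data, not taken from the paper's dataset.
sources = [
    Source("http://example.org/a", size=500, score=0.9),
    Source("http://example.org/b", size=300, score=0.2),
    Source("http://example.org/c", size=400, score=0.7),
    Source("http://example.org/d", size=100, score=0.5),
]
plan = schedule_updates(sources, budget=1000)
print(plan)
```

With a budget of 1000 units, the sketch selects sources a, c, and d and skips b, which no longer fits. In an iterative setting, the same selection would be repeated at every update step with scores recomputed from the latest observations.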
Similar resources
Addendum to “Efficiently Enabling Conventional Block Sizes for Very Large Die-stacked DRAM Caches”
Abstract The MICRO 2011 paper “Efficiently Enabling Conventional Block Sizes for Very Large Die-stacked DRAM Caches” proposed a novel die-stacked DRAM cache organization embedding the tags and data within the same physical DRAM row and then using compound access scheduling to manage the hit latency and a MissMap structure to make misses more efficient. This addendum provides a revised performan...
An Investigation of HTTP Header Information for Detecting Changes of Linked Open Data Sources
Data on the Linked Open Data (LOD) cloud changes frequently. Applications that operate on local caches of Linked Data need to be aware of these changes. In this way they can update their cache to ensure operating on the most recent version of the data. Given the HTTP basis recommended in the Linked Data guidelines, the native way of detecting changes would be to use HTTP header information, suc...
Local Disk Caching for Client-Server Database Systems
Client disks are a valuable resource that are not adequately exploited by current client-server database systems. In this paper, we propose the use of client disks for caching database pages in an extended cache architecture. We describe four algorithms for managing disk caches and investigate the tradeoffs inherent in keeping a large volume of disk-cached data consistent using a detailed sim...
Performance of Route Caching Strategies in Dynamic Source Routing
On-demand routing protocols for mobile ad hoc networks utilize route caching in different forms in order to reduce the routing overheads as well as to improve the route discovery latency. For route caches to be effective, they need to adapt to frequent topology changes. Using an on-demand protocol called “Dynamic Source Routing” (DSR), we study the problem of keeping the caches up-to-date in dyn...
Information-theoretic Analysis of Entity Dynamics on the Linked Open Data Cloud
The Linked Open Data (LOD) cloud is expanding continuously. Entities appear, change, and disappear over time. However, relatively little is known about the dynamics of the entities, i.e., the characteristics of their temporal evolution. In this paper, we employ clustering techniques over the dynamics of entities to determine common temporal patterns. We define an entity as RDF resource togethe...
Publication date: 2015